Semantic Annotation Layer in Russian National Corpus: Lexical Classes of Nouns and Adjectives

Authors

  • Olga N. Lashevskaja
  • Olga Yu. Shemanaeva
Abstract

This paper describes a project carried out within the Russian National Corpus (http://www.ruscorpora.ru). Besides such obligatory constituents of a linguistic corpus as POS (part-of-speech) and morphological tagging, the RNC contains semantic annotation. Six classifications are involved in the tagging: category, taxonomy, mereology, topology, evaluation, and derivational classes. The operation of the context semantic rules is shown by applying them to various polysemous nouns and adjectives. Our results demonstrate that semantic tags incorporated in the context are highly effective for word sense disambiguation (WSD).

Similar resources

Identification of context markers for Russian nouns

The research project presented in this paper aims at the identification of context markers for Russian nouns and their use in construction identification. The body of contexts has been extracted from the Russian National Corpus (RNC). The context processing procedure takes into account the lexical and semantic information represented in the corpus annotation. Merged meanings of words are taken into ...

Using Semantic Annotations to Cluster Lexical Relationships

This report describes work that builds on the Semantic Annotation project from the 2003 Johns Hopkins University summer workshop. This study investigates the automatic derivation of preferences for both adjectives and verbs from the 26-million-word semantically annotated corpus produced as part of the workshop. This corpus was enhanced by identifying additional named entities in the text. ...

On the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian

The paper is focused on classes of lexical units which arise as a result of derivational processes – word formation and semantic transfers, acting either in isolation or together, on the basis of common semantic foundations that bind targets and sources of derivation. The lexical items which constitute the classes under study vary in their denotative characteristics and due to their categ...

FrameBank: A Database of Russian Lexical Constructions

Russian FrameBank is a bank of annotated samples from the Russian National Corpus which documents the use of lexical constructions (e.g. argument constructions of verbs and nouns). FrameBank belongs to FrameNet-oriented resources, but unlike Berkeley FrameNet it focuses more on the morphosyntactic and semantic features of individual lexemes rather than the generalized frames, following the theor...

Semantic Annotation of Verbs for the Tatar Corpus

This paper discusses the problem of developing the metalanguage for linguistic applications and introduces a tag set for the semantic annotation of verbs for the Tatar National Corpus. At present, there are no generally accepted standards for the development of corpus semantic annotation. In many cases, it is made by individual researchers or teams for one or another research project, and chara...

Publication date: 2008